{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mQcpAkYcHXfz"
      },
      "source": [
        "# 📓 LlamaIndex Quickstart\n",
        "\n",
        "In this quickstart you will create a simple Llama Index app and learn how to log it and get feedback on an LLM response.\n",
        "\n",
        "You'll also learn how to use feedbacks for guardrails, via filtering retrieved context.\n",
        "\n",
        "For evaluation, we will leverage the RAG triad of groundedness, context relevance and answer relevance.\n",
        "\n",
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/truera/trulens/blob/main/examples/quickstart/llama_index_quickstart.ipynb)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "37NJNLnrHXf5"
      },
      "source": [
        "## Setup\n",
        "\n",
        "### Install dependencies\n",
        "Let's install some of the dependencies for this notebook if we don't have them already"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "uk7xEvryHXf6"
      },
      "outputs": [],
      "source": [
        "# !pip install trulens trulens-apps-llamaindex trulens-providers-openai llama_index openai"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "siCcTkE_HXf9"
      },
      "source": [
        "### Add API keys\n",
        "For this quickstart, you will need an Open AI key. The OpenAI key is used for embeddings, completion and evaluation."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "DSIbUDmXHXf-"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "\n",
        "os.environ[\"OPENAI_API_KEY\"] = \"sk-...\""
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "KtPfLY-bHXf-"
      },
      "source": [
        "### Import from TruLens"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "EqKjz6I8HXf-"
      },
      "outputs": [],
      "source": [
        "from trulens.core import TruSession\n",
        "\n",
        "session = TruSession()\n",
        "session.reset_database()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "zuWp9tuvHXf_"
      },
      "source": [
        "### Download data\n",
        "\n",
        "This example uses the text of Paul Graham’s essay, [“What I Worked On”](https://paulgraham.com/worked.html), and is the canonical llama-index example.\n",
        "\n",
        "The easiest way to get it is to [download it via this link](https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt) and save it in a folder called data. You can do so with the following command:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "srI2bJSeHXgA"
      },
      "outputs": [],
      "source": [
        "import os\n",
        "import urllib.request\n",
        "\n",
        "url = \"https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt\"\n",
        "file_path = \"data/paul_graham_essay.txt\"\n",
        "\n",
        "if not os.path.exists(\"data\"):\n",
        "    os.makedirs(\"data\")\n",
        "\n",
        "if not os.path.exists(file_path):\n",
        "    urllib.request.urlretrieve(url, file_path)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "0hc43gaDHXgA"
      },
      "source": [
        "### Create Simple LLM Application\n",
        "\n",
        "This example uses LlamaIndex which internally uses an OpenAI LLM."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "KAlL3UYEHXgA"
      },
      "outputs": [],
      "source": [
        "from llama_index.core import Settings\n",
        "from llama_index.core import SimpleDirectoryReader\n",
        "from llama_index.core import VectorStoreIndex\n",
        "from llama_index.llms.openai import OpenAI\n",
        "\n",
        "Settings.chunk_size = 128\n",
        "Settings.chunk_overlap = 16\n",
        "Settings.llm = OpenAI()\n",
        "\n",
        "documents = SimpleDirectoryReader(\"data\").load_data()\n",
        "index = VectorStoreIndex.from_documents(documents)\n",
        "\n",
        "query_engine = index.as_query_engine(similarity_top_k=3)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "UXNPHrTzHXgB"
      },
      "source": [
        "### Send your first request"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "iyiMd2u2HXgB"
      },
      "outputs": [],
      "source": [
        "response = query_engine.query(\"What did the author do growing up?\")\n",
        "print(response)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "lBEoL6Z_HXgB"
      },
      "source": [
        "## Initialize Feedback Function(s)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "dQOuvTJ2HXgB"
      },
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "from trulens.apps.llamaindex import TruLlama\n",
        "from trulens.core import Feedback\n",
        "from trulens.providers.openai import OpenAI\n",
        "\n",
        "# Initialize provider class\n",
        "provider = OpenAI()\n",
        "\n",
        "# select context to be used in feedback. the location of context is app specific.\n",
        "\n",
        "context = TruLlama.select_context(query_engine)\n",
        "\n",
        "# Define a groundedness feedback function\n",
        "f_groundedness = (\n",
        "    Feedback(\n",
        "        provider.groundedness_measure_with_cot_reasons, name=\"Groundedness\"\n",
        "    )\n",
        "    .on(context.collect())  # collect context chunks into a list\n",
        "    .on_output()\n",
        ")\n",
        "\n",
        "# Question/answer relevance between overall question and answer.\n",
        "f_answer_relevance = Feedback(\n",
        "    provider.relevance_with_cot_reasons, name=\"Answer Relevance\"\n",
        ").on_input_output()\n",
        "# Question/statement relevance between question and each context chunk.\n",
        "f_context_relevance = (\n",
        "    Feedback(\n",
        "        provider.context_relevance_with_cot_reasons, name=\"Context Relevance\"\n",
        "    )\n",
        "    .on_input()\n",
        "    .on(context)\n",
        "    .aggregate(np.mean)\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "m_8paPbzHXgB"
      },
      "source": [
        "## Instrument app for logging with TruLens"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "vqMrn5O9HXgC"
      },
      "outputs": [],
      "source": [
        "tru_query_engine_recorder = TruLlama(\n",
        "    query_engine,\n",
        "    app_name=\"LlamaIndex_App\",\n",
        "    app_version=\"base\",\n",
        "    feedbacks=[f_groundedness, f_answer_relevance, f_context_relevance],\n",
        ")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "hU2EDcJlHXgC"
      },
      "outputs": [],
      "source": [
        "# or as context manager\n",
        "with tru_query_engine_recorder as recording:\n",
        "    query_engine.query(\"What did the author do growing up?\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "niAiyEqNHXgC"
      },
      "source": [
        "## Use guardrails\n",
        "\n",
        "In addition to making informed iteration, we can also directly use feedback results as guardrails at inference time. In particular, here we show how to use the context relevance score as a guardrail to filter out irrelevant context before it gets passed to the LLM. This both reduces hallucination and improves efficiency.\n",
        "\n",
        "Below, you can see the TruLens feedback display of each context relevance chunk retrieved by our RAG."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "aruZRnWWHXgC"
      },
      "outputs": [],
      "source": [
        "from trulens.dashboard.display import get_feedback_result\n",
        "\n",
        "last_record = recording.records[-1]\n",
        "get_feedback_result(last_record, \"Context Relevance\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Rff3QwnWHXgC"
      },
      "source": [
        "Wouldn't it be great if we could automatically filter out context chunks with relevance scores below 0.5?\n",
        "\n",
        "We can do so with the TruLens guardrail, *WithFeedbackFilterNodes*. All we have to do is use the method `of_query_engine` to create a new filtered retriever, passing in the original retriever along with the feedback function and threshold we want to use."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "81ggGUehHXgD"
      },
      "outputs": [],
      "source": [
        "from trulens.apps.llamaindex.guardrails import WithFeedbackFilterNodes\n",
        "\n",
        "# note: feedback function used for guardrail must only return a score, not also reasons\n",
        "f_context_relevance_score = Feedback(provider.context_relevance)\n",
        "\n",
        "filtered_query_engine = WithFeedbackFilterNodes(\n",
        "    query_engine, feedback=f_context_relevance_score, threshold=0.5\n",
        ")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "XAHr_PTAHXgD"
      },
      "source": [
        "Then we can operate as normal"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "5EYzKMUQHXgD"
      },
      "outputs": [],
      "source": [
        "tru_recorder = TruLlama(\n",
        "    filtered_query_engine,\n",
        "    app_name=\"LlamaIndex_App\",\n",
        "    app_version=\"filtered\",\n",
        "    feedbacks=[f_answer_relevance, f_context_relevance, f_groundedness],\n",
        ")\n",
        "\n",
        "with tru_recorder as recording:\n",
        "    llm_response = filtered_query_engine.query(\n",
        "        \"What did the author do growing up?\"\n",
        "    )\n",
        "\n",
        "display(llm_response)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DmvWItyhHXgD"
      },
      "source": [
        "## See the power of context filters!\n",
        "\n",
        "If we inspect the context relevance of our retrieval now, you see only relevant context chunks!"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "Mgecr95XHXgD"
      },
      "outputs": [],
      "source": [
        "from trulens.dashboard.display import get_feedback_result\n",
        "\n",
        "last_record = recording.records[-1]\n",
        "get_feedback_result(last_record, \"Context Relevance\")"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "azjH0hpBHXgD"
      },
      "outputs": [],
      "source": [
        "session.get_leaderboard()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "j9tQKlkpHXgD"
      },
      "source": [
        "## Retrieve records and feedback"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "qroJ59tPHXgE"
      },
      "outputs": [],
      "source": [
        "# The record of the app invocation can be retrieved from the `recording`:\n",
        "\n",
        "rec = recording.get()  # use .get if only one record\n",
        "# recs = recording.records # use .records if multiple\n",
        "\n",
        "display(rec)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "2ShVtz5JHXgE"
      },
      "outputs": [],
      "source": [
        "from trulens.dashboard import run_dashboard\n",
        "\n",
        "run_dashboard(session)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "L8ZtG32GHXgE"
      },
      "outputs": [],
      "source": [
        "# The results of the feedback functions can be rertireved from\n",
        "# `Record.feedback_results` or using the `wait_for_feedback_result` method. The\n",
        "# results if retrieved directly are `Future` instances (see\n",
        "# `concurrent.futures`). You can use `as_completed` to wait until they have\n",
        "# finished evaluating or use the utility method:\n",
        "\n",
        "for feedback, feedback_result in rec.wait_for_feedback_results().items():\n",
        "    print(feedback.name, feedback_result.result)\n",
        "\n",
        "# See more about wait_for_feedback_results:\n",
        "# help(rec.wait_for_feedback_results)"
      ]
    },
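    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As noted in the comments above, you can also wait on the `Future` instances in `Record.feedback_results` directly with `concurrent.futures.as_completed`. Below is a minimal sketch of that alternative, assuming `rec.feedback_results` is an iterable of futures whose completed results expose a `result` attribute (as used above)."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Sketch: wait on the feedback futures directly instead of using the\n",
        "# `wait_for_feedback_results` utility. Assumes `rec.feedback_results` holds\n",
        "# `concurrent.futures.Future` instances, as noted in the previous cell.\n",
        "from concurrent.futures import as_completed\n",
        "\n",
        "for future in as_completed(rec.feedback_results):\n",
        "    feedback_result = future.result()\n",
        "    print(feedback_result.result)"
      ]
    },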
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "V9rUnalIHXgE"
      },
      "outputs": [],
      "source": [
        "records, feedback = session.get_records_and_feedback()\n",
        "\n",
        "records.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "mWotvxUPHXgE"
      },
      "outputs": [],
      "source": [
        "session.get_leaderboard()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "uSDBdPHNHXgE"
      },
      "source": [
        "## Explore in a Dashboard"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "6ObLpst_HXgE"
      },
      "outputs": [],
      "source": [
        "run_dashboard(session)  # open a local streamlit app to explore\n",
        "\n",
        "# stop_dashboard(session) # stop if needed"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qRGgCrMKHXgF"
      },
      "source": [
        "Alternatively, you can run `trulens` from a command line in the same folder to start the dashboard."
      ]
    }
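    ,
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# Minimal sketch of the command-line alternative mentioned above: launch the\n",
        "# dashboard from a shell in this notebook's folder (assumes the `trulens`\n",
        "# command is available after installing the packages above).\n",
        "# !trulens"
      ]
    }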
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "Python 3.11.4 ('agents')",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3"
    },
    "vscode": {
      "interpreter": {
        "hash": "7d153714b979d5e6d08dd8ec90712dd93bff2c9b6c1f0c118169738af3430cd4"
      }
    },
    "colab": {
      "provenance": []
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}